Beyond Understanding and Prediction: Data Mining for Action
Author
Abstract
Association analysis and prediction are two major tasks in data mining, and they represent its two foremost objectives: data exploration for understanding and model construction for prediction. Data mining is known as a process for converting raw data into useful information, that is, knowledge. However, what do we do with the knowledge discovered from data? We need knowledge that enables action, such as preventing disease in health care or retaining customers. One important attribute of knowledge is therefore actionability. To be acted on, knowledge must encode causal relationships that imply the mechanisms of the system under consideration. Causal inference is a sophisticated topic that spans multiple disciplines: computer science, statistics, medicine, economics, and social science, to name a few. There are a number of well-established frameworks for causal inference from data under assumptions. Unfortunately, these assumptions are not directly testable in data, and hence there are few off-the-shelf data mining methods for causal discovery. Many practitioners still use conventional machine learning methods for tasks that actually require causal inference [1]. It is desirable to have simple data mining tools that explore causal relationships with few or no assumptions. The discoveries may not be provably causal, but they are high-quality candidates that exclude many non-causal or spurious relationships. This would be a significant step forward for actionable data mining, since it is well known that association rule mining generates too many spurious relationships and that a classification model often makes a correct prediction based on non-interpretable (or even wrong) evidence. Spurious relationships and non-interpretable or wrong evidence hinder many data mining applications, especially in medicine and the social sciences.
We have been using well-known causal inference principles to discover causal relationships and build causally interpretable data mining models, such as causal rules [2] and causal decision trees [3], and applying causal discovery methods to real-world biological problems [4, 5]. In this talk, I will discuss some of our exploratory work in this promising direction.
General Terms: Algorithms; Experimentation; Theory.
Similar resources
Action mirroring and action understanding in children
The past decade has seen increasing interest in action understanding and children’s mirroring of others’ behavior. Behavioral investigations have focused on the development and significance of mimicry, goal prediction and imitation. Others have focused on the neural basis of action mirroring, identifying particular electrophysiological markers or related brain regions. A vivid debate...
Alert correlation and prediction using data mining and HMM
Intrusion Detection Systems (IDSs) are security tools widely used in computer networks. While they seem to be promising technologies, they have some serious drawbacks: when used in large, high-traffic networks, IDSs generate high volumes of low-level alerts that are hardly manageable. Accordingly, a recent track of security research has emerged, focused on alert correlation, which ext...
Personal Credit Score Prediction using Data Mining Algorithms (Case Study: Bank Customers)
Knowledge and information extraction from data is an age-old concept in scientific studies. In industrial decision-making processes, the application of this concept gives rise to data-mining opportunities. Personal credit scoring is an ever-vital tool for banking systems to manage and minimize the inherent risks of the financial sector; thus, the design and improvement of credit scorin...
A data mining approach to employee turnover prediction (case study: Arak automotive parts manufacturing)
Training and adaptation of employees are time- and money-consuming. Employee turnover can be predicted from organizational and personal historical data in order to reduce probable losses to organizations. Prediction methods are highly relevant to human resource management for obtaining patterns from historical data. This article implements knowledge discovery steps on real data of a manufacturing pla...
Using Combined Descriptive and Predictive Methods of Data Mining for Coronary Artery Disease Prediction: a Case Study Approach
Heart disease is one of the major causes of morbidity in the world. Currently, large proportions of healthcare data are not processed properly and thus fail to be used effectively for decision-making purposes. The risk of heart disease may be predicted via investigation of heart disease risk factors coupled with data mining knowledge. This paper presents a model developed using combined descri...
Publication date: 2017